Intraspecific variability (IV) is often seen as an unstructured, intrinsic property of individuals, mostly genetic. Here, we investigate whether it can result from the species responses to the spatial variations of the environment. Our hypothesis is that the individuals can be clones, i.e. have no genetic differences between them, and still have different measured attributes because the environment in which they strive varies, as shown in the previous analysis of a clonal dataset. The response of an individual results from the integrated effects of the environmental conditions it has encountered during its life (local environmental variation, or microsite effect) and its genetic features. Moreover, the differences between individuals due to environmental variation does not imply a broader overlap of the species niches. Therefore, this type of IV has no effect on species coexistence directly. We designed and conducted a virtual experiment that aims at illustrating that IV, or “individual effects,” can result from unobserved variation in some environmental variables [1], therefore accounting for multidimensional species differences which cannot be observed on a few niche axes.
To do so, we first considered a model that depicts the response of a process, e.g. growth, for individual clones of two different species to all the environmental variables that influence it. Henceforth denoted the “Perfect knowledge model,” it represents the perfect knowledge of the process as it occurs in the field and thus includes no residuals :
\(Y_{ijt} = \beta_{0j} + \beta_{1j} * ln(X_{1ijt}) + \beta_{2j} * X_{2ijt} + \ldots + \beta_{Nj} * X_{Nijt}\) “Perfect knowledge model”
where \(1 \leq i \leq 500\) are the individuals ; \(1 \leq j \leq 2\) are the species ; \(Y\) is a response variable, for instance growth ; \(X1\) to \(XN\) are explanatory variables, for instance environmental variables, that can vary with time \(t\) ; and the \(\beta_{k,i}\) , \(1 \leq k \leq N\) depict the species-specific responses to these environmental variables. As we consider clones, individuals of the same species respond the same way to environmental variables, and variation in \(Y\) among individuals within species is due to differences in the environment where each individual thrives. The first explanatory variable was natural log-transformed to represent a log-log relationship as would be the relationship between tree growth and light.
We fix the species parameters of the “Perfect knowledge model” as follows, using 10 environmental variables (N=10) :
Parameters of species 1 : \(\beta_0\) = 0.25, \(\beta_1\) = 0.15, \(\beta_2\) to \(\beta_{6}\) are chosen randomly between -0.05 and 0.05 and \(\beta_{7}\) to \(\beta_{10}\) are chosen randomly between 0 and 0.05.
Parameters of species 2 : \(\beta_0\) = 0.2, \(\beta_1\) = 0.1, \(\beta_2\) to \(\beta_{10}\) are chosen the same way as for species 1.
The difference between the species is imposed by those parameters. Here, species 1 is more competitive on average thanks to its higher intercept (\(\beta_0\)) and reaction to the first environmental variable (\(\beta_1\)). The first environmental variable (\(X1\)) has a higher weight in the computation of the response variable, as would be the most limiting factor for the response variable.
To account for genetic variability in our generated data, we added an IV in species parameters by sampling each individual parameter in a normal distribution centered on the species mean parameter and with a standard deviation of a quarter of the species mean parameter.
To represent the spatialised environment of such a forest plot, we build a 2D matrix of dimension 500 \(\times\) 500 for each environmental variable at a time \(t_0\), by randomly generating them with spatial autocorrelation. We here considered that all environmental variables were independent. Each variable was simulated by using the gstat R package [2,3], enabling to create autocorrelated random fields. A spherical semivariogram model was used for each of the ten environmental variables, with a mean of 0 for each explanatory variable (beta = 0), a sill of 1 (psill = 1) for each and a range of 50 (range = 50).
We then considered that \(Y\) had been measured at two times, \(t_0\) and \(t_1\), for each individual as it would be under periodic forest plot censuses for instance. A pair of coordinates within this spatialised environment was randomly assigned to each of the 500 individuals. We considered that some of the environmental variables (light, temperature, humidity, nutrient availability for instance) had changed between \(t_0\) and \(t_1\) and others had not (slope, altitude for instance). For the first environmental variable which had the stronger impact on \(Y\) values and another randomly drawn environmental variable, we computed values at \(t_1\) as \(t_0 + \epsilon\), \(\epsilon \sim \mathcal{N}(0, 0.1)\) (they randomly increase or decrease). For two other environmental variables randomly drawn, \(X_{t_1} = X_{t_0} + |\epsilon|, \epsilon \sim \mathcal{N}(0, 0.1)\) (they increase) and for two other environmental variables, \(X_{t_1} = X_{t_0} - |\epsilon|, \epsilon \sim \mathcal{N}(0, 0.1)\) (they decrease).
This led to two repeated measurements of \(Y\) for each individual of each species, i.e. 2000 values of \(Y_{i,j,t}\), with \(i=[1;2]\) ; \(j=[1,50]\) ; \(t=[t_0;t_1]\).
Figure 1.1: Histogram of raw and log-transformed Y with and without genetic IV
Figure 1.2: Raw and log-transformed Y versus X1 plots with and without genetic IV
Figure 1.1 shows the distribution of the data and Figure 1.2 shows the relationship between the response Y and the environmental variable X1.
Figure 1.3: Virtual landscape representing X1 and the individual with their respective Y value.
Figure 1.3 shows that microsite effects, which are the effects of the local multidimensional environment on the observed variable, can result in local reversals of the competitive hierarchy between species. Indeed, the local outcome of competition can be opposite to the mean hierarchy : at one point of the space-time, an individual of Species 1 can overcome an individual of Species 2, whilst Species 1 is, on average, the fittest of the two species. Consequently, microsite effects foster the coexistence of Species 1 and Species 2.
These two virtual datasets (with and without genetic variability) were then analysed with a “Partial knowledge model,” which represents the process understanding ecologists can derive using these datasets and from their incomplete characterisation of the environment : only a few variables (here only one, \(X_1\)) are actually measured and taken into account.
\(\ln(Y_{ijt}) = [\beta'_{0j} + b_{0i}] + \beta'_{1j} \ln(X_{1ijt}) + \varepsilon_{ijt}\) “Partial knowledge model”
Priors :
\(\varepsilon_{ijt} \sim \mathcal{N}(0,V_j)\), \(j = [1, 2]\), iid
\(\beta'_{kj}\sim \mathcal{N}_2(0, 1)\), \(k = [0, 1]\), \(j = [1, 2]\), iid
\(b_{0i} \sim \mathcal{N}(0, V_{bj})\), \(i = [1, 500]\), \(j = [1, 2]\), iid
Second level priors :
\(V_j \sim \mathcal{IG}(10^-3, 10^-3)\), \(j = [1, 2]\), iid
\(V_{bj} \sim \mathcal{IG}_2(10^-3, 10^-3)\), \(j = [1, 2]\), iid
This model includes a random individual effect on the intercept (\(b_{0i}\)) allowing to account for any variability among individuals within species regarding this parameter. We ran this model twice, with the datasets generated with and without genetic IV. We then acquired the species parameters and the IV generated with genetic IV and get only IV from the perfect knowledge model. These models were fitted with Bayesian inference thanks to Stan, using the brms package [4,5] in Rstudio, with 100000 iterations, a warming period of 50000 iterations and a thinning of 50, thus we finally obtain 1000 estimates per parameter.
We visualised the convergence and the results of the models thanks to trace and density plots.
## No divergences to plot.
Figure 2.1: Trace of the posterior of the model for Species 1 without genetic IV
Figure 2.2: Density of the posterior of the model for Species 1 without genetic IV
## No divergences to plot.
Figure 2.3: Trace of the posterior of the model for Species 2 without genetic IV
Figure 2.4: Density of the posterior of the model for Species 2 without genetic IV
## No divergences to plot.
Figure 2.5: Trace of the posterior of the model for Species 1 with genetic IV
Figure 2.6: Density of the posterior of the model for Species 1 with genetic IV
## No divergences to plot.
Figure 2.7: Trace of the posteriors of the model for Species 2 with genetic IV
Figure 2.8: Density of the posteriors of the model for Species 2 with genetic IV
## Warning: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running
## more iterations and/or setting stronger priors.
| \(\beta_0\) | \(\beta_1\) | \(V_b\) | \(V\) | |
|---|---|---|---|---|
| Species 1 | ||||
| Estimate | 6.4e-02 | 2.9e-01 | 9.5e-02 | 1.9e-03 |
| Estimation error | 4.7e-03 | 2.4e-03 | 3e-03 | 6.1e-05 |
| Species 2 | ||||
| Estimate | 1.3e-01 | 1.5e-01 | 7.6e-02 | 5.2e-04 |
| Estimation error | 3.3e-03 | 6.3e-04 | 2.4e-03 | 1.7e-05 |
We inferred a high IV even in the absence of genetic IV (2.1). Therefore, observed IV does not necessarily reveal intrinsic (mainly genetic) IV, but can also reveal hidden dimensions of the environment. The mean and quantiles of the results of the models were used to visualise the inferred link between \(Y\) and \(X1\). To do so, we created a sequence of explanatory variable values and computed the response variable with the mean and quantiles of the parameters inferred with the models.
Figure 2.9: Plot of the real values - points - and estimated mean - bold line - and 95 % confidence interval - thin lines - of Y versus X1. The dotted lines correspond to the 95 % interval due to genetic IV.
Figure 2.10: The virtual landscape of X1 values and of Y values and the corresponding estimated relationship between X1 and Y.
In Figure 2.9 and panel A of Figure 2.10, the solid and bold lines represent the mean rate of the response variable (e.g. growth) of Species 1 (blue) and Species 2 (orange) as computed with the parameters retrieved from the model with or without genetic variability respectively. The plain lines represent the 95% interval of the posteriors from the model without genetic variability and the dotted lines show the 95% confidence interval of the posteriors from the model with genetic variability. Figure 2.9 shows that genetic IV increases the overlap between the response of Species 1 and Species 2.
Figure 2.10 represents the virtual landscape of \(X_1\) and the corresponding individual response and the plot of \(Y\) versus \(X1\) (without genetic variability) next to each other, helping to visualise this virtual experiment.
In the model without genetic IV, the unobserved variation in the environment resulted in an important “individual effect.” The performances of the two species can intersect, which means that the competitive hierarchy among the two species can be locally reversed depending on the micro-environmental variation, offering opportunities for the coexistence of the two species in a variable environment. In that sense, incorporating the individual level in statistical models can help to explain the coexistence of species through a better integration of niche multidimensionality in models.
We then analysed the spatial structure of individual the response variable. We hypothesised that as environmental variables are spatially autocorrelated, and the response variable is the result of these variables, then the response variable should also be spatially autocorrelated. To test this, we computed Moran’s I test. It is computed with the Moran.I function of the ape R package [6].
| Moran’s I index | P-value | |
|---|---|---|
| Species 1 | 6.7e-02 | 0e+00 |
| Species 2 | 6.9e-02 | 0e+00 |
| All individuals | 4.3e-02 | 0e+00 |
Table 3.1 shows that the individual response variable is largely spatially autocorrelated. This is due to the spatial autcorrelation in the environmental variables. In this simple, completely controlled experiment, this is natural. However, in a less controlled environment, detecting spatial autocorrelation in a response variable could be the sign of the spatial structure of the underlying environmental variables although other factors like the genetic structure could lead to spatial autocorrelation in the response variable.
Finally, we used semivariograms to visualise the spatial autocorrelation of the response variable and to test whether the individual growth was more similar within conspecifics than within heterospecifics. The semivariograms were computed and modelled with the variogram and the fit.variogram functions of the gstat R package ([3]) respectively. The variogram models were spherical.
Figure 4.1: Estimated and empirical semivariograms of Y for each species and both species together.
As the semivariance between individuals of both species was higher than the semivariance between conspecifics (4.1), the individual response variable was more similar between conspecifics than heterospecifics. Considering this response variable as a proxy of fitness, and linking fitness to competition, we argue that in a Lotka-Volterra model, this would translate into \(\alpha_{i,i} > \alpha_{i,j}\), the main condition for stable coexistence. Therefore, stable coexistence is possible even with high intraspecific variability, especially when this variability is not intrinsic but due to environmental heterogeneity that is structured in space.
The whole analysis is done using the R language [7] in the Rstudio environment [8]. The tables are made with the kableExtra package [9], the figures with the package ggplot2 [10], and the code uses other packages of the Tidyverse [11] (dplyr [12], magrittr [13]) and other R packages (here [14], ggpubr [15], sp [16,17], ggnewscale [18], gstat [3]). The pdf and html documents are produced thanks to the R packages rmarkdown [19–21], knitr [22–24] and bookdown [25].